The Alpha 21364 network architecture

Authors

  • Shubhendu S. Mukherjee
  • Peter J. Bannon
  • Steven Lang
  • Aaron Spink
  • David Webb
Abstract

Advances in semiconductor technology have let microprocessors integrate more than 100 million transistors on a single chip. The Alpha 21364 microprocessor uses 152 million transistors to integrate an Alpha 21264 processor core, a 1.75-Mbyte second-level cache, cache coherence hardware, two memory controllers, and a multiprocessor router on a single die, as Figure 1a shows. In a 0.18-micron bulk CMOS process, the 21364 will run at 1.2 GHz and provide 12.8 Gbytes/s of local memory bandwidth and 22.4 Gbytes/s of router bandwidth, as Figure 2 shows. The Alpha 21364's tightly coupled multiprocessor network connects up to 128 such processors in a 2D torus network; Figure 1b shows a 12-processor configuration. A fully configured, 128-processor, shared-memory system can support up to 4 terabytes of Rambus memory and hundreds of terabytes of disk storage. We could also easily redesign the 21364 to support a much larger configuration. This multiprocessor configuration supports the massive computation and communication requirements of various application domains, such as high-performance technical computing, database servers, Web servers, and telecommunications. We designed the Alpha 21364 network architecture to meet the communication demands of these memory- and I/O-intensive applications. The novelty of the Alpha 21364's router architecture lies in its extremely low latency, enormous bandwidth, and support for directory-based cache coherence. The router offers extremely low latency because it operates at 1.2 GHz, the same clock speed as the processor core. The pin-to-pin latency within the router is 13 cycles, or 10.8 ns. In comparison, the ASIC-based SGI Spider router runs at 100 MHz and offers a 40-ns pin-to-pin latency. Similarly, the Alpha 21364 offers an enormous amount of peak and sustained bandwidth. The 21364 router can sustain between 70 and 90 percent of its 22.4-Gbytes/s peak bandwidth.
The 21364's router can offer such enormous bandwidth because of aggressive routing algorithms, carefully crafted distributed arbitration schemes, large amounts of on-chip buffering, and a fully pipelined router implementation. Finally, the network and router architectures have explicit support for directory-based cache coherence, such as separate virtual channels for different coherence protocol packet classes. This helps avoid deadlocks and improves the performance of the 21364's coherence protocol.
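
The figures quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only: it converts the 13-cycle router latency at 1.2 GHz to nanoseconds, computes the 70 to 90 percent sustained-bandwidth range from the 22.4-Gbytes/s peak, and shows how neighbors wrap around in a 2D torus. The function names and the 4 x 3 grid used for the 12-processor example are assumptions made for illustration, not details taken from the paper.

```python
def cycles_to_ns(cycles: int, clock_ghz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock rate (GHz)."""
    return cycles / clock_ghz


def sustained_bandwidth_range(peak_gbytes_s: float,
                              low: float = 0.70,
                              high: float = 0.90) -> tuple[float, float]:
    """Return the (low, high) sustained bandwidth for a given peak."""
    return peak_gbytes_s * low, peak_gbytes_s * high


def torus_neighbors(x: int, y: int, width: int, height: int) -> dict[str, tuple[int, int]]:
    """Return the four neighbors of node (x, y) in a width x height 2D torus.

    Edges wrap around, so every node has exactly four neighbors.
    The direction names are hypothetical labels chosen for this sketch.
    """
    return {
        "east": ((x + 1) % width, y),
        "west": ((x - 1) % width, y),
        "north": (x, (y - 1) % height),
        "south": (x, (y + 1) % height),
    }


if __name__ == "__main__":
    # 13 cycles at 1.2 GHz, the figures given in the abstract.
    print(f"pin-to-pin latency: {cycles_to_ns(13, 1.2):.1f} ns")

    lo, hi = sustained_bandwidth_range(22.4)
    print(f"sustained bandwidth: {lo:.2f} to {hi:.2f} Gbytes/s")

    # Assumed 4 x 3 layout for a 12-processor torus.
    print(torus_neighbors(0, 0, width=4, height=3))
```

The wraparound in `torus_neighbors` is what distinguishes a torus from a plain mesh: a corner node such as (0, 0) still has a neighbor in every direction.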

Similar resources

Alpha 21364 to Ease Memory Bottleneck: 10/26/98

With processor speeds rapidly approaching the gigahertz mark, the shortcomings of today’s memory architectures are becoming all too apparent. Waiting a few hundred nanoseconds to retrieve data from main memory is tolerable for a 100-MHz CPU, but this delay turns into hundreds of cycles for a 1-GHz processor. Compaq’s Alpha processors are likely to be the first to reach that speed, and at this m...

Testability Features of the Alpha 21364 Microprocessor

The custom testability strategy of the Alpha 21364, Hewlett-Packard’s most recent Alpha microprocessor, builds upon its Alpha 21264 embedded core. Several additional DFT features integrate to meet the testing challenges of the new generation.

A Power Model for Routers: Modeling Alpha 21364 and InfiniBand Routers

As interconnection networks proliferate to many new applications, a low-latency high-throughput fabric is no longer sufficient. Applications are becoming power-constrained. In this paper, we propose an architectural-level power model for interconnection network routers that will allow researchers and designers to easily factor in power when exploring architectural trade-offs. We applied our model...

Covalent hydration energies for purine analogs by quantum chemical methods

In this work, covalent hydration energies for a variety of azanaphthalenes and purine analogs have been calculated using a variety of quantum chemical methods. On the basis of these results, we recommend the CPCM(UA0)-B3LYP/6-31+G(d,p) level for rapid prediction of covalent hydration energies. However, we caution the use of this methodology for computing covalent hydration energies for fluorine...

Predicting the coefficients of the Daubert and Danner correlation using a neural network model

In the present research, three different architectures were investigated to predict the coefficients of the Daubert and Danner equation for calculation of saturated liquid density. The first architecture with 4 network input parameters including critical temperature, critical pressure, critical volume and molecular weight, the second architecture with 6 network input parameters including the on...

Publication date: 2001